LegalAgentBench: Evaluating LLM Agents in Legal Domain

Li, Haitao, Chen, Junjie, Yang, Jingli, Ai, Qingyao, Jia, Wei, Liu, Youfeng, Lin, Kai, Wu, Yueyue, Yuan, Guozhi, Hu, Yiran, Wang, Wuyue, Liu, Yiqun, Huang, Minlie

arXiv.org Artificial Intelligence

As LLM agents grow more intelligent and autonomous, their potential applications in the legal domain are becoming increasingly apparent. However, existing general-domain benchmarks cannot fully capture the complexity and subtle nuances of real-world judicial cognition and decision-making. Therefore, we propose LegalAgentBench, a comprehensive benchmark specifically designed to evaluate LLM agents in the Chinese legal domain. LegalAgentBench includes 17 corpora from real-world legal scenarios and provides 37 tools for interacting with external knowledge. We designed a scalable task construction framework and carefully annotated 300 tasks. These tasks span various types, including multi-hop reasoning and writing, and range across different difficulty levels, effectively reflecting the complexity of real-world legal scenarios. Moreover, beyond evaluating final success, LegalAgentBench incorporates keyword analysis of intermediate steps to calculate progress rates, enabling more fine-grained evaluation. We evaluated eight popular LLMs, highlighting the strengths, limitations, and potential areas for improvement of existing models and methods. LegalAgentBench sets a new benchmark for the practical application of LLMs in the legal domain, with its code and data available at \url{https://github.com/CSHaitao/LegalAgentBench}.
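The abstract's idea of a progress rate, scoring how far an agent got by matching keywords in its intermediate outputs, can be illustrated with a minimal sketch. The trace format and keyword lists below are assumptions for illustration, not LegalAgentBench's actual data schema:

```python
# Hedged sketch: score an agent's partial progress by checking how many
# expected keywords appear anywhere in its intermediate outputs.
# `trace` and `keywords` are illustrative, not benchmark data.

def progress_rate(trace_steps, expected_keywords):
    """Fraction of expected keywords found in the concatenated trace."""
    if not expected_keywords:
        return 0.0
    joined = " ".join(trace_steps)
    hits = sum(1 for kw in expected_keywords if kw in joined)
    return hits / len(expected_keywords)

trace = [
    "Queried company registry for defendant",
    "Retrieved judgment document",
]
keywords = ["defendant", "judgment", "appeal"]
print(progress_rate(trace, keywords))  # 2 of 3 keywords found -> ~0.667
```

Unlike a binary success flag, this kind of metric gives partial credit to agents that complete some reasoning hops but fail before the final answer.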


Leverage Knowledge Graph and Large Language Model for Law Article Recommendation: A Case Study of Chinese Criminal Law

Chen, Yongming, Chen, Miner, Zhu, Ye, Pei, Juan, Chen, Siyu, Zhou, Yu, Wang, Yi, Zhou, Yifan, Li, Hao, Zhang, Songan

arXiv.org Artificial Intelligence

Court efficiency is vital for social stability. However, in most countries around the world, grassroots courts face case backlogs, with decisions relying heavily on judicial personnel's cognitive labor and lacking intelligent tools to improve efficiency. To address this issue, we propose an efficient law article recommendation approach utilizing a Knowledge Graph (KG) and a Large Language Model (LLM). First, we propose a Case-Enhanced Law Article Knowledge Graph (CLAKG) as a database to store current law statutes, historical case information, and the correspondence between law articles and historical cases. Additionally, we introduce an automated CLAKG construction method based on an LLM. On this basis, we propose a closed-loop law article recommendation method. Finally, through a series of experiments using judgment documents from the website "China Judgements Online", we improved the accuracy of law article recommendation from 0.549 to 0.694, demonstrating that our proposed method significantly outperforms baseline approaches.
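The core idea of case-enhanced recommendation can be sketched in miniature: historical cases link to law articles, and a new case is matched against similar historical cases to recommend their articles. The toy data and the naive word-overlap similarity below are illustrative assumptions, not CLAKG's actual graph schema or matching method:

```python
# Hedged sketch: recommend law articles by matching a new case description
# to the most similar historical case in a tiny case->article mapping.
# The data and word-overlap similarity are illustrative stand-ins.

case_to_articles = {
    "theft of electric bicycle": ["Criminal Law Art. 264"],
    "online loan fraud": ["Criminal Law Art. 266"],
}

def recommend(new_case, kg):
    def overlap(a, b):
        return len(set(a.split()) & set(b.split()))
    best_match = max(kg, key=lambda case: overlap(case, new_case))
    return kg[best_match]

print(recommend("fraud involving an online loan platform", case_to_articles))
```

In the paper's setting, the LLM replaces this crude lexical match with semantic understanding of the case facts, and the graph additionally encodes statute text and article-case links.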


Grammatical cues to subjecthood are redundant in a majority of simple clauses across languages

Mahowald, Kyle, Diachek, Evgeniia, Gibson, Edward, Fedorenko, Evelina, Futrell, Richard

arXiv.org Artificial Intelligence

Grammatical cues are sometimes redundant with word meanings in natural language. For instance, English word order rules constrain the word order of a sentence like "The dog chewed the bone" even though the status of "dog" as subject and "bone" as object can be inferred from world knowledge and plausibility. Quantifying how often this redundancy occurs, and how the level of redundancy varies across typologically diverse languages, can shed light on the function and evolution of grammar. To that end, we performed a behavioral experiment in English and Russian and a cross-linguistic computational analysis measuring the redundancy of grammatical cues in transitive clauses extracted from corpus text. English and Russian speakers (n=484) were presented with subjects, verbs, and objects (in random order and with morphological markings removed) extracted from naturally occurring sentences and were asked to identify which noun is the subject of the action. Accuracy was high in both languages (~89% in English, ~87% in Russian). Next, we trained a neural network machine classifier on a similar task: predicting which nominal in a subject-verb-object triad is the subject. Across 30 languages from eight language families, performance was consistently high: a median accuracy of 87%, comparable to the accuracy observed in the human experiments. The conclusion is that grammatical cues such as word order are necessary to convey subjecthood and objecthood in only a minority of naturally occurring transitive clauses; nevertheless, they (a) provide an important source of redundancy and (b) are crucial for conveying intended meaning that cannot be inferred from the words alone, including descriptions of human interactions, where roles are often reversible (e.g., Ray helped Lu/Lu helped Ray), and expressions of non-prototypical meanings (e.g., "The bone chewed the dog.").
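The classification task described above can be framed as: given a verb and two nouns with order scrambled and morphology removed, predict which noun is the subject. The toy "plausibility" table below stands in for the paper's trained neural classifier and is purely illustrative:

```python
# Hedged sketch of the task framing: predict the subject of a scrambled
# (verb, noun, noun) triad. A hand-written plausibility table substitutes
# for the paper's neural classifier; entries are illustrative only.

plausible_subjects = {
    ("chewed", "dog", "bone"): "dog",
    ("helped", "Ray", "Lu"): None,  # reversible roles: meaning alone can't decide
}

def predict_subject(verb, noun_a, noun_b):
    for (v, subj, obj), answer in plausible_subjects.items():
        if v == verb and {subj, obj} == {noun_a, noun_b}:
            return answer  # None marks an irreducibly ambiguous pair
    return None

print(predict_subject("chewed", "bone", "dog"))  # -> "dog"
```

The paper's finding is that such meaning-based prediction succeeds for most triads (median 87% across 30 languages), which is exactly why grammatical cues like word order are redundant in the majority of clauses yet indispensable for the reversible-role cases the table marks as undecidable.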


Clinical diagnostics and the artificial intelligence boom

#artificialintelligence

Leica Microsystems of Wetzlar, Germany, markets Digital Image Hub Enterprise, its clinical workflow software for digital pathology. It provides image management, integration, and communication capabilities in conjunction with the client viewer, SlidePath Gateway. Digital Image Hub Enterprise utilizes slide barcodes to identify and automatically consolidate multiple slides within a case into a work list. Pathologists can also view their digital slides through internet browsers and on an iPad, allowing access anytime and anywhere.


A preferential, pattern-seeking semantics for natural language inference

Wilks, Y. A.

Classics

Syntax, Preference and Right Attachment. Yorick Wilks, Xiuming Huang & Dan Fass, Computing Research Laboratory, New Mexico State University, Las Cruces, NM, USA 88003.

ABSTRACT: The paper claims that the right attachment rules for phrases originally suggested by Frazier and Fodor are wrong, and that none of the subsequent patchings of the rules by syntactic methods have improved the situation. For each rule there are perfectly straightforward and indefinitely large classes of simple counterexamples. We then examine suggestions by Ford et al., Schubert and Hirst, which are quasi-semantic in nature and which we consider ingenious but unsatisfactory. We offer a straightforward solution within the framework of preference semantics, and argue that the principal issue is not the type and nature of information required to get appropriate phrase attachments, but the issue of where to store the information and with what processes to apply it. We present a Prolog implementation of a best-first algorithm covering the data and contrast it with closely related ones, all of which are based on the preferences of nouns and prepositions, as well as verbs.